Vectorial cloning system of DNA

ABSTRACT

The invention relates to a vector system for cloning which consists of a nucleotide sequence containing a fusion sequence, one or more recognition sites for restriction endonucleases which cut outside their recognition sites, one or more further restriction endonuclease recognition sites which can be used for cloning a foreign protein, and the sequence encoding the desired foreign protein, wherein, after subsequent restriction with the restriction endonucleases followed by religation, the foreign protein sequence is located directly adjacent to the fusion sequence.

The invention relates to a vector system for cloning.

As a rule, common vectors which are used for cloning in prokaryotic systems contain the following features: a selection gene, e.g. the gene encoding resistance to ampicillin, a marker gene which makes it possible, e.g. on the basis of a color reaction, as in the case of the lacZ gene, to distinguish vectors with and without insert, and, especially, an origin of replication. The reader is also referred to the following documents with regard to the state of the art:

a) EP 0 532 043 A2,

b) EP 0 466 332 A2,

c) EP 0 293 249 A1,

d) GB 2 212 160 A and

e) U.S. Pat. No. 5,196,524.

The vector system described in publication c consists of a sequence of nucleotides which codes for the expression of a fusion protein. However, the foreign protein sequence in this case is linked directly to the sequence for an enzyme and is not separated by additional structures during the cloning process.

In systems which are based on phages rather than plasmids, further genetic elements are included which are important for the functions of the life cycle of the phage. Cleavage sites which occur only a few times in the vector, preferably only once, are normally used in these systems for inserting a foreign sequence into a marker gene. A series of such cleavage sites is normally arranged in a so-called multiple cloning site. Some vectors which are used for expressing foreign proteins include further elements apart from this multiple cloning site. In some of these vectors, use is made of a sequence which, together with the insert to be cloned, results in a fusion protein which then has an affinity for, e.g., maltose residues or nickel chelates. In some cases, these affinity sequences are followed by a recognition sequence for a proteinase, e.g. factor Xa. After purification by affinity chromatography, these proteinase cleavage sites make it possible to cleave the fusion protein into the affinity moiety and the foreign protein which is actually to be expressed. However, since this proteinase cleavage site is as a rule followed by the multiple cloning site into which the sequence to be expressed has been cloned, additional, frequently unwanted amino acids remain on the foreign protein after the fusion protein has been processed with the proteinase.

Most foreign protein sequences which are to be expressed are present in a nucleic acid environment which does not allow them to be cloned efficiently directly after an endoproteinase recognition sequence using current cloning strategies. Using a multiple cloning site immediately after the endoproteinase recognition sequence (i.e. 3′ of the endoproteinase recognition sequence) markedly facilitates cloning of foreign protein sequences which are to be expressed. The disadvantage of this approach is that, at the protein level, digestion with the enzyme which recognizes the corresponding protein sequence does not release the foreign protein itself; instead, what is recovered is a fusion protein which comprises the foreign protein sequence together with additional amino acids which are encoded by the remnants of the multiple cloning site.

The object of the invention is to provide a novel vector system for cloning foreign proteins which avoids the disadvantages of the state of the art. This object is achieved by the features of claims 1 and 10. Advantageous embodiments ensue from the features of claims 2-9 and 11.

Use of the cloning vector system which is presented here now makes it possible, in a subsequent digestion and religation step, to bring the foreign protein sequence to be expressed directly up against an endoproteinase recognition sequence. This ensures that, after digestion with the enzyme which recognizes the corresponding protein sequence, the foreign protein sequence to be expressed is released, without any further components, from an expressed fusion protein.

A particular advantage is also to be seen in the fact that it is possible to select the reading frame at will during the process which is presented here. This makes it possible not only to exactly position the foreign gene component to be expressed but also to define the start of the fusion gene component. The following strategy was selected for preventing, in a simple but nevertheless highly selective manner, the unwanted expression, occasioned by the cloning procedure, of additional peptide components on the foreign gene to be expressed:

Use was made of the recognition sequence for the enzyme BcgI in order to clone it within the multiple cloning site, into which the foreign sequence can subsequently be inserted using any arbitrary enzymes. BcgI is one of the enzymes, so far the only one which is commercially available, which cleave both upstream and downstream of their recognition sequence at a defined distance (10/12 nucleotides) from this sequence.

Cleavage by the enzyme therefore removes a sequence segment of 2×6+10+12=34 bp, and subsequent religation generates sequences from which these 34 base pairs are missing. If a specific cloning has been carried out, the last nucleotide of the endoproteinase recognition sequence and the first nucleotide of the first codon of the protein to be cloned, for example, then lie directly adjacent. Examples of sequences are given in Example 1 in association with a listing of some applications.
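The 34 bp excision can be illustrated with a minimal Python sketch. It assumes the BcgI recognition sequence CGA(N)6TGC, which is not spelled out in the text above, and it models only the top strand: the 10 nucleotides upstream of the site, the 12-nucleotide site and the 12 nucleotides downstream are dropped and the two ends are joined. The 2-nucleotide overhangs left by the real enzyme and a site in the opposite orientation are not modelled. The function name `excise_bcgi` is hypothetical; the input is SEQ ID NO:2 from the sequence listing.

```python
import re

# Assumed BcgI recognition sequence (top strand): CGA, six arbitrary nucleotides, TGC.
BCGI_SITE = re.compile(r"cga[acgtn]{6}tgc")
UPSTREAM = 10    # nucleotides removed 5' of the site on the top strand
DOWNSTREAM = 12  # nucleotides removed 3' of the site on the top strand


def excise_bcgi(seq: str) -> tuple[str, str]:
    """Return (religated sequence, excised segment) for the first BcgI site."""
    seq = seq.lower()
    match = BCGI_SITE.search(seq)
    if match is None:
        raise ValueError("no BcgI recognition site found")
    start = match.start() - UPSTREAM
    end = match.end() + DOWNSTREAM
    if start < 0 or end > len(seq):
        raise ValueError("site too close to the end of the sequence")
    return seq[:start] + seq[end:], seq[start:end]


# SEQ ID NO:2 from the sequence listing (71 nt).
SEQ_ID_2 = (
    "cacggatcaatcgaaggacgcatcagcctggtccgagctg"
    "agtgcagtcgaccgcatgcgagctcggtacc"
)

religated, excised = excise_bcgi(SEQ_ID_2)
print(len(excised))  # site (12) + upstream (10) + downstream (12)
print(religated)
```

In this simplified model the excised segment has exactly the length stated above, and the religated sequence is shorter than the input by the same amount.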

A particular advantage of the BcgI system is its tendency to produce small deletions (3 nucleotides as a rule) at the cleavage site in 10-50% of cases, depending on the buffer concentration. In these cases, the first amino acid of the foreign protein to be expressed, as a rule methionine, is no longer a constituent of the foreign protein which is released after digestion with the enzyme which recognizes the corresponding protein sequence.

A similar result is obtained when two restriction endonuclease recognition sites are used; i.e. with one of the sites being in, or in the immediate vicinity of, the region of an endoproteinase recognition sequence and the other recognition site, which is ligation-compatible with the former, being located directly upstream of the foreign protein sequence to be expressed such that digestion with the corresponding restriction endonuclease and subsequent religation brings the beginning of the foreign protein sequence to be expressed directly up against the endoproteinase recognition sequence.

The invention is explained below with the aid of some sequences and examples.

FIG. 1 shows a first sequence,

FIG. 2 shows a second sequence of the multiple cloning site,

FIG. 3 shows a third sequence using BamHI as an example,

FIG. 4 shows a fourth sequence using ClaI as an example,

FIG. 5 shows one of the preceding sequences following restriction and religation,

FIG. 6 shows a fifth sequence, namely a fusion sequence, and

FIG. 7 shows a sixth sequence which contains a BcgI recognition site.

The first sequence, shown in FIG. 1, permits cloning in all three reading frames.

1. If, for example, a NotI or AscI recognition site is located directly upstream of the ATG of a foreign gene to be expressed, this enzyme can be used and the fragment can be filled in to give a blunt end and cloned on to the starting construct which has been cut with HincII. This is shown in FIG. 2.

2. If, for example, any arbitrary restriction recognition site which generates a 4-nucleotide 3′-overhanging end is located directly upstream of the ATG of a foreign gene to be expressed, this enzyme can then be used, and the fragment can be filled in to give a blunt end and cloned on to the starting construct which has been cut with AccI and has likewise been filled in. This is shown in FIG. 3 using BamHI as an example.

3. If, for example, any arbitrary restriction recognition site which generates a 2-nucleotide 3′-overhanging end is located directly upstream of the ATG of a foreign gene to be expressed, this enzyme can then be used and the fragment can be filled in to give a blunt end and cloned on to the starting construct which has been cut with SalI and has likewise been filled in. This is shown in FIG. 4 using ClaI as an example.

Subsequent restriction with BcgI and religation gives rise, in each of the cases, to a sequence of the type shown in FIG. 5.

In this sequence, the first nucleotide of the sequence to be expressed comes directly after the Xa protease recognition site.
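That the three cloning routes converge on the same junction can be checked with a short Python sketch. It uses SEQ ID NOS:4 to 6 from the sequence listing, which appear to correspond to the constructs of FIGS. 2 to 4, and the same simplified assumptions as before: a BcgI recognition sequence of CGA(N)6TGC in the forward orientation and a top-strand model in which 10 + 12 + 12 nucleotides are removed. The helper `religate_after_bcgi` is hypothetical.

```python
import re


def religate_after_bcgi(seq, upstream=10, downstream=12):
    """Drop the BcgI site plus its flanks (top-strand model) and rejoin the ends."""
    site = re.search(r"cga[acgtn]{6}tgc", seq)
    if site is None:
        raise ValueError("no BcgI recognition site found")
    return seq[:site.start() - upstream] + seq[site.end() + downstream:]


# Constructs after insertion of the foreign gene (n = foreign gene nucleotides).
constructs = {
    "SEQ ID NO:4": "cacggatcaatcgaaggacgcatcagcctggtccgagctgagtgcagtcggccgcatgnnnnn",
    "SEQ ID NO:5": "cacggatcaatcgaaggacgcatcagcctggtccgagctgagtgcagtcggatccatgnnnnn",
    "SEQ ID NO:6": "cacggatcaatcgaaggacgcatcaccctcctccgacctcactgcagtcgacgatatgnnnnn",
}

XA_SITE = "atcgaaggacgc"  # codons for Ile Glu Gly Arg

products = {name: religate_after_bcgi(seq) for name, seq in constructs.items()}
for name, product_seq in products.items():
    junction = product_seq.index(XA_SITE) + len(XA_SITE)
    # Print the three nucleotides that directly follow the factor Xa site.
    print(name, product_seq[junction:junction + 3])
```

Under these assumptions, each of the three inputs yields the same religated sequence, in which the ATG of the foreign gene follows the last codon of the factor Xa site without intervening nucleotides.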

EXAMPLE 1 Design of the Vector System

Use was made of an inducible prokaryotic promoter system, i.e. the IPTG-inducible lacZ system, as is present in the vector pQE30 supplied by Qiagen. In this vector, the first protein-encoding sequence following the promoter is a sequence of 6 histidine residues. In our construct, this sequence is followed directly by a cleavage site for the Xa endoproteinase. The corresponding fifth sequence is shown in FIG. 6.
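The layout of the fusion sequence can be verified by translating it. The following Python sketch builds the standard genetic code table and translates SEQ ID NO:9 from the sequence listing; the function name `translate` and the use of "X" for codons containing ambiguous nucleotides are choices made for this illustration only.

```python
from itertools import product

# Standard genetic code, codons enumerated in t, c, a, g order at each position.
BASES = "tcag"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def translate(dna):
    """Translate complete codons from position 0; unknown codons become 'X'."""
    dna = dna.lower()
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, usable, 3))


# SEQ ID NO:9: six histidine codons, Gly Ser, then the factor Xa site.
SEQ_ID_9 = "catcaccatcaccatcacggatcaatcgaaggacgc"

protein = translate(SEQ_ID_9)
print(protein)
```

The translation should agree with SEQ ID NO:10: six histidines, a Gly-Ser spacer and the factor Xa recognition sequence Ile-Glu-Gly-Arg at the C-terminal end, so that a start codon placed directly after it encodes the first residue released by the proteinase.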

In our construct, this is then followed by a multiple cloning site which contains a BcgI recognition site. This is shown in FIG. 7.

Since a BcgI site is also present in the ampicillin resistance-encoding region of the starting vector which we used, this site was destroyed by in vitro mutagenesis while leaving the ampicillin resistance intact.

EXAMPLE 2

Cloning a foreign protein, in this case the polyoma virus structural protein VP1, into this cloning vector system.

VP1 contains a BamHI cleavage site which is suitable for cloning at a distance of 6 nucleotides upstream of its methionine start codon. A further cleavage site (SphI) which is required for cloning follows at a distance of 59 nucleotides after its coding sequence. The cloning vector system, as well as the construct which encompasses the VP1-encoding region, were cut with the suitable restriction endonucleases, i.e. AccI/blunt and SphI. The coding sequence was ligated into the cloning vector system which had been prepared in this way and transformed into, and amplified by growth in, E. coli cells (XL1 blue; lacIq prevents expression in the absence of IPTG). Substantial quantities of the plasmid construct which had been prepared in this way were isolated and purified for further processing.

The purified construct was now digested with the enzyme BcgI, separated in an agarose gel and purified; it was then religated and once again transformed into bacteria which are particularly suitable for expression (RB791, a derivative of W3110 possessing lacIqL8), and amplified in these bacteria.

In three independent clones, it was found that the coding sequence of the VP1 protein, beginning with methionine, was located directly after the last amino acid of the factor Xa cleavage site (IleGluGlyArg)(SEQ ID NO:1).

In one clone, it was observed that, while the coding sequence of the VP1 protein followed the factor Xa sequence, the three nucleotides which encode methionine had been deleted. Similar results were obtained when the experiments were repeated with other proteins.
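The two outcomes observed in the sequenced clones can be told apart with a small Python sketch. It is an illustration only: the function `classify_clone` is hypothetical, the fusion sequence is taken from SEQ ID NO:9, and the open reading frame used in the usage lines is an arbitrary placeholder, not the VP1 sequence.

```python
XA_SITE = "atcgaaggacgc"  # Ile Glu Gly Arg, as encoded in SEQ ID NO:9


def classify_clone(clone_seq, orf):
    """Classify a religated clone by what directly follows the factor Xa site.

    orf is the expected coding sequence of the foreign protein, starting with atg.
    """
    clone_seq, orf = clone_seq.lower(), orf.lower()
    pos = clone_seq.find(XA_SITE)
    if pos < 0:
        return "no factor Xa site"
    downstream = clone_seq[pos + len(XA_SITE):]
    if downstream.startswith(orf):
        return "precise fusion"
    if downstream.startswith(orf[3:]):
        return "start codon deleted"
    return "other"


FUSION = "catcaccatcaccatcacggatcaatcgaaggacgc"  # SEQ ID NO:9
PLACEHOLDER_ORF = "atgaaacccgggttt"  # arbitrary placeholder, NOT the VP1 sequence

print(classify_clone(FUSION + PLACEHOLDER_ORF, PLACEHOLDER_ORF))
print(classify_clone(FUSION + PLACEHOLDER_ORF[3:], PLACEHOLDER_ORF))
```

The first call represents a clone in which the start codon lies directly after the factor Xa site; the second represents a clone in which the three nucleotides encoding methionine are missing.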

EXAMPLE 3 Expressing a Foreign Protein

Constructs which encoded a fusion protein containing VP1 were transformed into bacterial cells (RB791) and the cells were grown in an overnight culture. This overnight culture was used to grow bacterial cultures up to a density of approx. 0.8 A₆₀₀. Expression of the fusion protein, encompassing the histidine residues, the factor Xa recognition site and VP1, was then induced by adding IPTG. After 6 hours of induction, the total protein was harvested and an aliquot was loaded onto a gel for checking the induction efficiency. The major portion of this protein mixture was bound to a nickel chelate column in accordance with the manufacturer's (Qiagen) instructions and washed on the column with various buffers. The fusion protein can be eluted from the column using a solution which contains 50 mM EGTA. However, in our case, the pure expression protein VP1 was released by subsequent digestion with the endoproteinase factor Xa.

SEQUENCE LISTING

Number of sequences: 11

SEQ ID NO: 1
Length: 4
Type: PRT
Organism: Unknown
Description: amino-terminal 4 amino acids of the Factor Xa cleavage site (unknown source)
Sequence: Ile Glu Gly Arg

SEQ ID NO: 2
Length: 71
Type: DNA
Organism: Unknown
Description: nucleotide sequence from a portion of a vector, including part of an affinity (e.g., fusion) protein and a multiple cloning site (unknown source)
Sequence: cacggatcaa tcgaaggacg catcagcctg gtccgagctg agtgcagtcg accgcatgcg agctcggtac c

SEQ ID NO: 3
Length: 7
Type: PRT
Organism: Unknown
Description: amino acid sequence encoded by the affinity (e.g., fusion) protein
Sequence: His Gly Ser Ile Glu Gly Arg

SEQ ID NO: 4
Length: 63
Type: DNA
Organism: Unknown
Description: nucleotide sequence from a portion of a vector, including part of an affinity (e.g., fusion) protein and a multiple cloning site (unknown source)
Sequence: cacggatcaa tcgaaggacg catcagcctg gtccgagctg agtgcagtcg gccgcatgnn nnn

SEQ ID NO: 5
Length: 63
Type: DNA
Organism: Unknown
Description: nucleotide sequence from a portion of a vector, including part of an affinity (e.g., fusion) protein and a multiple cloning site (unknown source)
Sequence: cacggatcaa tcgaaggacg catcagcctg gtccgagctg agtgcagtcg gatccatgnn nnn

SEQ ID NO: 6
Length: 63
Type: DNA
Organism: Unknown
Description: nucleotide sequence from a portion of a vector, including part of an affinity (e.g., fusion) protein and a multiple cloning site (unknown source)
Sequence: cacggatcaa tcgaaggacg catcaccctc ctccgacctc actgcagtcg acgatatgnn nnn

SEQ ID NO: 7
Length: 42
Type: DNA
Organism: Unknown
Description: nucleotide sequence from a portion of a vector, including part of an affinity (e.g., fusion) protein and a multiple cloning site (unknown source)
Sequence: catcaccatc accatcacgg atcaatcgaa ggacgcatgn nn

SEQ ID NO: 8
Length: 13
Type: PRT
Organism: Unknown
Description: amino acid sequence encoded by the affinity (e.g., fusion) protein
Sequence: His His His His His His Gly Ser Ile Glu Gly Arg Met

SEQ ID NO: 9
Length: 36
Type: DNA
Organism: Unknown
Description: nucleotide sequence from a portion of a vector, including part of an affinity (e.g., fusion) protein and a multiple cloning site (unknown source)
Sequence: catcaccatc accatcacgg atcaatcgaa ggacgc

SEQ ID NO: 10
Length: 12
Type: PRT
Organism: Unknown
Description: amino acid sequence encoded by the affinity (e.g., fusion) protein
Sequence: His His His His His His Gly Ser Ile Glu Gly Arg

SEQ ID NO: 11
Length: 71
Type: DNA
Organism: Unknown
Description: nucleotide sequence from a portion of a vector, including part of an affinity (e.g., fusion) protein and a multiple cloning site (unknown source)
Sequence: cacggatcaa tcgaaggacg catcagcctg gtccgaggtg agtgcagtcg accgcatgcg agctcggtac c

What is claimed is:
 1. A vector system for cloning comprising (a) a sequence encoding an affinity element, (b) one or more restriction endonuclease recognition sites for restriction endonucleases which cut outside their recognition sites, (c) one or more further restriction endonuclease recognition sites which can be used for cloning nucleic acid sequences encoding a desired foreign protein, and (d) a sequence encoding said foreign protein, wherein said restriction endonuclease recognition site(s) for restriction endonucleases which cut outside their recognition sites is/are designed such that, after restriction, the sequence encoding said foreign protein can be religated directly to said sequence encoding said affinity element.
 2. The vector system of claim 1, wherein said sequence encoding said affinity element comprises an endoproteinase recognition sequence.
 3. The vector system of claim 1, wherein, when several restriction endonuclease recognition sites for restriction endonucleases which cut outside their recognition sites are used, these sites are recognized by the same enzyme or by isoschizomeric enzymes.
 4. The vector system of claim 1, wherein the restriction endonuclease recognition sites for restriction endonucleases which cut outside their recognition sites are two different, but ligation-compatible restriction endonuclease recognition sites.
 5. The vector system of claim 1, wherein only one restriction endonuclease recognition site for restriction endonucleases which cut outside their recognition sites is present, said recognition site resulting in two cleavage sites by said restriction endonuclease.
 6. The vector system of claim 5, wherein the restriction endonuclease recognition site for restriction endonucleases which cut outside their recognition sites is a recognition site for BcgI.
 7. A kit for cloning foreign proteins, said kit comprising the vector system of claim 1.
 8. A process for producing a desired foreign protein, said process comprising (a) providing the vector of claim 1, (b) restricting said vector with one or more of said restriction endonucleases which cut outside their recognition sites, (c) religating said sequence for said foreign protein directly to said fusion sequence, and (d) expressing a fusion protein from said vector, said fusion protein comprising said desired foreign protein.
 9. The process of claim 8, wherein, after the fusion protein has been expressed, the desired foreign protein is obtained by cleaving the fusion protein at an endoprotease cleavage site which is located at the end of the fusion sequence.
 10. The process of claim 9, wherein the N-terminal first amino acid is deleted from the desired foreign protein by using BcgI as the restriction endonuclease which cuts outside its recognition site. 